Day 10 Azure Cognitive Services: Image Description, Telling a Story from a Picture

2021 iThome Ironman Contest

DAY 10

AI & Data

I Don't Really Understand AI, but I Know a Little Python and Azure: Part 10 of the series

Ben

Team: Five Chickens Who Found Their Deadlift Had Dropped 100 kg Once They Could Go Back to the Gym

2021-09-10 07:00:12

a cat sleeping on a wooden structure

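Image Description, as the name suggests, uses computer vision to analyze an image and generate a sentence that people can read, describing what the image contains. This approach is usually called Image Captioning, meaning giving an image a caption. Roughly speaking, a Convolutional Neural Network (CNN) serves as the encoder that extracts features from the image, and a Recurrent Neural Network (RNN) then serves as the decoder that generates the sentence.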
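To make the encoder/decoder data flow concrete, here is a toy forward pass in plain Python. It is only a sketch under loud assumptions: the vocabulary, sizes, and function names are made up for illustration, the weights are random rather than trained (so the generated words are meaningless), and nothing here reflects how the Azure service is actually implemented.

```python
import math
import random

VOCAB = ["<start>", "<end>", "a", "cat", "sleeping", "on", "wood"]  # toy vocabulary
HIDDEN = 4  # size of the image feature vector and of the RNN hidden state


def rand_matrix(rng, rows, cols):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(image, kernels):
    """CNN-style encoder: 2x2 convolution per kernel, ReLU, global average pool."""
    height, width = len(image), len(image[0])
    features = []
    for kernel in kernels:  # each kernel is a flat list of 4 weights
        total = 0.0
        for i in range(height - 1):
            for j in range(width - 1):
                patch = [image[i][j], image[i][j + 1],
                         image[i + 1][j], image[i + 1][j + 1]]
                total += max(0.0, sum(w * p for w, p in zip(kernel, patch)))
        features.append(total / ((height - 1) * (width - 1)))
    return features


def decode(features, w_hh, w_xh, w_out, embed, max_len=6):
    """RNN decoder: start from the image features and greedily emit words."""
    hidden = features  # the image features initialise the hidden state
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        pre = [a + b for a, b in
               zip(matvec(w_hh, hidden), matvec(w_xh, embed[token]))]
        hidden = [math.tanh(v) for v in pre]
        scores = matvec(w_out, hidden)
        # Greedy choice over every token except "<start>" (index 0).
        token = max(range(1, len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


rng = random.Random(0)
kernels = rand_matrix(rng, HIDDEN, 4)
w_hh = rand_matrix(rng, HIDDEN, HIDDEN)
w_xh = rand_matrix(rng, HIDDEN, HIDDEN)
w_out = rand_matrix(rng, len(VOCAB), HIDDEN)
embed = rand_matrix(rng, len(VOCAB), HIDDEN)

image = [[rng.random() for _ in range(5)] for _ in range(5)]  # fake 5x5 grayscale image
features = encode(image, kernels)
caption = decode(features, w_hh, w_xh, w_out, embed)
print(features, caption)
```

The point is the shape of the pipeline: pixels go in, a fixed-size feature vector comes out of the encoder, and the decoder turns that vector into one word per step until it emits an end token or reaches the length limit.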

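Key and endpoint

The key and endpoint used here are the same ones used earlier for object detection.

Go to https://portal.azure.com/#home
Click All resources
Click the Computer Vision resource you created earlier
Click Keys and Endpoint
Copy the key and the endpoint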
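Once the key and endpoint are copied, one option is to keep them out of the source file and read them from environment variables instead. The sketch below assumes two variable names, AZURE_CV_KEY and AZURE_CV_ENDPOINT; both names and the load_cv_credentials helper are hypothetical and are not defined by Azure or by this series.

```python
import os


def load_cv_credentials(environ=os.environ):
    """Return (key, endpoint) read from environment variables.

    AZURE_CV_KEY and AZURE_CV_ENDPOINT are hypothetical names chosen for
    this sketch; use whatever names fit your own deployment.
    """
    key = environ.get("AZURE_CV_KEY", "").strip()
    endpoint = environ.get("AZURE_CV_ENDPOINT", "").strip()
    if not key or not endpoint:
        raise RuntimeError(
            "Set AZURE_CV_KEY and AZURE_CV_ENDPOINT before starting the app."
        )
    return key, endpoint
```

The returned pair can then take the place of the hard-coded SUBSCRIPTION_KEY and ENDPOINT strings when constructing the client.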

Sample code

# Package: azure-cognitiveservices-vision-computervision
from azure.cognitiveservices.vision.computervision import (
    ComputerVisionClient,
)
from msrest.authentication import CognitiveServicesCredentials

# Use the key (SUBSCRIPTION_KEY) and endpoint (ENDPOINT) to get access
# to the Computer Vision service.
SUBSCRIPTION_KEY = "YOUR SUBSCRIPTION_KEY"
ENDPOINT = "YOUR ENDPOINT"
CV_CLIENT = ComputerVisionClient(
    ENDPOINT, CognitiveServicesCredentials(SUBSCRIPTION_KEY)
)

# Publicly reachable URL of the image to describe (placeholder).
url = "YOUR IMAGE URL"

# Use describe_image to get sentences describing the image,
# along with the confidence of each one.
description_results = CV_CLIENT.describe_image(url)
output = ""
for caption in description_results.captions:
    output += "'{}' with confidence {:.2f}% \n".format(
        caption.text, caption.confidence * 100
    )
print(output)
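When a chatbot only needs a single sentence to reply with, one approach is to pick the most confident caption and fall back to a default message when nothing is confident enough. The helper below is a sketch: best_caption, its min_confidence threshold, and the fallback text are all hypothetical. It relies only on the text and confidence attributes used in the sample above, so it is exercised here with stand-in objects and a made-up confidence value rather than a real API response.

```python
from types import SimpleNamespace


def best_caption(captions, min_confidence=0.5,
                 fallback="Sorry, I cannot describe this image."):
    """Return the most confident caption text, or a fallback message."""
    candidates = [c for c in captions if c.confidence >= min_confidence]
    if not candidates:
        return fallback
    top = max(candidates, key=lambda c: c.confidence)
    return "{} ({:.2f}%)".format(top.text, top.confidence * 100)


# Stand-in objects that mimic the text/confidence attributes of a caption.
# The confidence values are invented for illustration.
fake_captions = [
    SimpleNamespace(text="a cat sleeping on a wooden structure", confidence=0.87),
    SimpleNamespace(text="a cat lying down", confidence=0.42),
]
print(best_caption(fake_captions))
```

With a real response, the call would be best_caption(description_results.captions).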

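Combining object detection with image description and adding both to the chatbot server gives the following result. In the next post, we can bring together all the features covered so far and complete a chatbot that tells a story from a picture.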